ABSTRACT


The  Armed  Services  Vocational  Aptitude 
Battery  (ASVAB)  is  highly  oriented  to 
math  and  verbal  content  areas.  New 
predictor  tests  that  are  unique  relative 
to  the  current  ASVAB  subtests  may  have 
potential  for  improving  predictive 
validity.  The  purpose  of  this  research 
memorandum  is  to  investigate  the 
incremental  validity  of  several  new 
tests  that  were  administered  as  part  of 
the  Marine  Corps  Job  Performance 
Measurement project for the infantry
occupational  field. 

EXECUTIVE SUMMARY


The  Armed  Services  Vocational  Aptitude  Battery  (ASVAB)  is  the  test 
used  by  the  military  services  to  select  and  classify  recruits.  The 
ASVAB  is  composed  of  ten  subtests  that  measure  four  general  content 
areas:  verbal,  mathematical,  technical,  and  speed.  The  purpose  of  this 
research  memorandum  is  to  investigate  several  new  tests  that  differ  in 
content  and  scope  from  the  current  ASVAB.  Each  new  test  was  judged 
relative  to  its  ability  to  improve  the  prediction  of  infantry 
performance  by  the  ASVAB. 

The new tests included paper-and-pencil measures of spatial ability
(space perception (SP), reasoning (RS), and assembling objects (AS)), a
video-firing test (VF), and a background questionnaire (Armed Services
Applicant Profile, ASAP). The measures of infantry performance were
developed for or collected as part of the Marine Corps Job Performance
Measurement (JPM) project: a hands-on performance test (HOPT), a
written job knowledge test (JKT), proficiency marks (PRO), and training
grades from the school of infantry (GPA).

Examinees were first-term infantrymen from four military
occupational specialties (MOSs). Over 1,000 riflemen were tested, and
about 300 Marines in three other infantry specialties were examined:
machinegunner, mortarman, and assaultman. Two days were required for
each Marine to complete all performance testing.

RESULTS 


The  estimation  of  validity  coefficients  is  influenced  by  a  variety 
of  factors:  restriction  of  score  distributions  due  to  the  selection 
process, shrinkage in multiple correlations when applying optimal
regression  weights  to  other  samples,  criterion  unreliability,  time  of 
administration  for  the  predictors,  etc.  The  impact  of  these  factors  as 
well  as  sampling  errors  on  validity  coefficients  is  even  further 
magnified  when  the  primary  issue  is  the  difference  between  validity 
coefficients.  Efforts  were  taken  to  account  for  several  potential  error 
sources  in  the  estimation  of  validity  coefficients. 


The  multiple  correlations  between  all  ASVAB  subtests  and  each 
performance  criterion  were  computed  to  provide  the  base  against  which 
increments  in  validity  by  the  new  tests  would  be  judged.  These  multiple 
correlations  showed  that  ASVAB  was  highly  related  to  JKT,  HOPT,  and 
GPA. The ASVAB was moderately related to PRO. Figure I shows both the
sample and range-corrected ASVAB validity bases (computed for the
enlistment ASVAB and also for a concurrently administered ASVAB) against
hands-on performance. These ASVAB bases were also computed for the
other performance criteria. The new tests had to demonstrate
improvements in validity above and beyond these levels that ASVAB is
currently able to achieve. For the infantry rifleman hands-on test, the
VF test improved the ASVAB validity by 0.015 to 0.03 validity points.

The  incremental  validities  against  rifleman  hands-on  performance  for 
each  new  predictor  are  plotted  in  figure  I. 

Table  I  highlights  the  best  single  new  predictor  test  against  each 
criterion  for  all  four  specialties.  Several  new  predictor  tests 
resulted  in  the  largest  increments  in  validity  against  HOPT.  These 
findings  were  consistent  with  the  differences  in  job  requirements,  which 
were  reflected  in  differences  in  hands-on  test  content  for  these 
specialties.  Part  of  the  hands-on  test  for  the  rifleman  specialty 
required  each  Marine  to  negotiate  an  unknown  trail  as  if  on  a  squad 
patrol  and  to  engage  popup  targets  with  the  M16A2  rifle.  The  prediction 
of  accurately  hitting  these  targets  and  other  rifleman  tasks  was 
improved  by  the  VF  test.  For  the  assaultman  MOS,  each  Marine  was 
required  to  fire  the  Launch  Effects  Trainer  (LET) ,  a  device  that 
simulates  firing  of  the  Dragon  missile.  Again,  the  VF  test  was  one  of 
the  better  new  predictors  in  improving  the  assaultman  validity;  AS  also 
was found to enhance the validity. Job requirements for the
machinegunner and mortarman specialties tended to be more spatially oriented.
Machinegunners were required to establish intersecting fields of fire as
well as to prepare range cards that document direction, elevation, and
range of targets. The space perception (SP) test was found to be the
best new predictor in improving the prediction of machinegunner job
performance.  The  mortarman  hands-on  test  required  the  Marine  to 
complete  many  procedural  requirements  in  mounting,  boresighting,  and 
laying  the  mortar.  The  assembling  objects  (AS)  test  resulted  in  the 
most  incremental  validity  for  this  specialty. 


Table I. Best new predictor test for each criterion

                            Criterion
MOS              HOPT      JKT       PRO     GPA
Rifleman         VF        AS        ASAP    VF(a)
Machinegunner    SP        AS        ASAP
Mortarman        AS        AS, SP    ASAP
Assaultman       VF, AS    AS        ASAP

a. Validity results against GPA were based on examinees from all
MOSs. Findings were consistent for both training locations.


The  JKTs  for  each  MOS  contained  many  common  infantry  items  although 
each  test  also  had  some  items  that  were  unique.  AS  was  found  to  be  the 
best  new  predictor  test  in  improving  the  validity  against  each  JKT  in 
the  range  of  2  to  4  percent.  Such  a  consistent  outcome  may  partially  be 
due  to  the  similarity  of  test  content  across  these  specialties. 


Figure  1.  Sample  and  corrected  validities  for  enlistment  and  concurrent  aptitude 
against  hands-on  performance  for  infantry  riflemen 


The ASVAB only moderately predicted PRO marks; the validity was
about 0.35. The ASAP was consistently the best new predictor for
improving the validity for these supervisor ratings. Despite significant
improvements in the prediction of PRO marks, the absolute
validities were still relatively low.

Several corrections were made to the validity coefficients to
account  for  the  impact  of  various  extraneous  sources  of  error.  The 
impact  of  these  corrections  is  noticeable  in  figure  I.  Such  corrections 
tended  to  significantly  reduce  the  gains  in  validity  due  to  the  new 
predictor  test.  Incremental  validities  corrected  for  range  restriction 
were  typically  half  as  large  as  the  sample  incremental  validities. 
Increments  based  on  concurrent  aptitude  were  likewise  less  than  gains 
computed  for  enlistment  aptitude  by  a  factor  of  a  half.  Adjustments  for 
time in service reduced even further the incremental gains (this impact
is not determinable from figure I). The impact of these error sources
highlights  the  potential  for  considerable  overestimation  of  incremental 
validities  if  appropriate  corrections  and  adjustments  are  not  made. 

CONCLUSIONS 

Data  from  the  Marine  Corps  JPM  project  allowed  for  a  thorough 
examination  of  the  measurement  and  prediction  of  infantry  performance. 
These  analyses  showed  that  the  ASVAB  does  an  excellent  job  of  predicting 
a variety of infantry performance measures: hands-on performance tests,
written  job  knowledge  tests,  and  infantry  school  training  grades.  ASVAB 
moderately  predicts  an  infantryman's  proficiency  rating.  The  ability  of 
any new predictor test to enhance the ASVAB's ability to predict
infantry performance was slight and mixed (except for proficiency marks,
which  are  questionable  as  objective  measures  of  job  performance). 

The  estimation  of  validity  coefficients  is  influenced  by  a  variety 
of  factors.  Efforts  were  taken  to  account  for  several  potential  error 
sources.  Such  corrections  and  adjustments  tended  to  significantly 
reduce  the  gains  in  validity  due  to  the  new  predictor  test.  Substantial 
overestimation  of  incremental  validities  is  possible  if  appropriate 
corrections  and  adjustments  are  not  made. 

Given  the  variability  of  incremental  validity  estimates  across  MOSs 
and  criteria,  it  is  difficult  to  make  a  strong  recommendation  as  to 
which,  if  any,  of  the  new  predictors  should  be  considered  for  possible 
inclusion  in  the  ASVAB.  Although  similar  gains  found  in  other  research 
have  been  noted  to  possibly  have  considerable  dollar  value,  any  true 
benefit that would result in fiscal savings has yet to be demonstrated.
Therefore, the slight validity gains found in these analyses
have  yet  to  demonstrate  any  tangible  significance  that  would  positively 
impact  the  overall  manpower  selection  and  classification  process. 

Even if "significant" increments in validity had been noted,
further investigation of the measurement properties of any new tests is
still required. For example, while the video-firing test tended to be
one of the better tests against hands-on performance, the test may be
susceptible to practice effects, as demonstrated by the significant
test-retest gains over a period of 7-10 days. Performance on such video
tests may also be affected by previous experience with video games or
computers. Such practice effects or experience may cancel any
validity gains if the test were used for operational testing.

Additional  issues  that  would  need  to  be  researched  include  subgroup 
analysis, coaching and test-taking strategies, and logistical concerns
for  implementing  the  test  within  an  operational  testing  program. 

Given  the  challenge  to  improve  the  prediction  of  infantry 
performance, it was found that larger percentage gains can be achieved
by  refining  the  current  aptitude  composites  or  by  using  an  optimal 
classification  system  based  on  all  ASVAB  subtests  than  can  be  achieved 
by  adding  new  predictor  tests  to  the  ASVAB.  Such  gains  may  be  achieved 
by  simply  correcting  known  inefficiencies  in  the  current  classification 
system.  With  only  minimal  gains  resulting  from  new  predictor  tests  and 
an  unknown  benefit  associated  with  such  small  gains,  it  would  be  more 
prudent  to  concentrate  on  refining  the  existing  classification  system. 

INTRODUCTION


The  Armed  Services  Vocational  Aptitude  Battery  (ASVAB)  is  the  test 
used  by  the  military  services  to  select  and  classify  recruits.  The 
ASVAB  is  composed  of  ten  subtests  that  measure  four  general  content 
areas:  verbal,  mathematical,  technical,  and  speed.  Various  aptitude 
composites,  computed  from  the  ten  ASVAB  subtests,  are  used  to  classify 
recruits  into  clusters  of  military  occupational  specialties  (MOSs)  that 
are  most  suited  to  their  aptitudes. 

Various  analyses  have  confirmed  the  four  general  content  areas  of 
the  ASVAB  [1],  although  these  factors  tend  to  be  correlated.  This 
implies  that  the  ASVAB  is  limited  in  the  number  of  dimensions  that  it 
effectively measures. To the extent that military jobs are
multidimensional and require a variety of skills and abilities, the ASVAB may
not be sensitive to the prediction of these qualities. The consideration
of new dimensions that might supplement the existing ASVAB by
expanding its range of predictors may hold significant promise for
improving the overall selection and classification system.

However,  the  consideration  of  new  predictors  is  unjustified  if 
there  is  not  a  similar  concern  for  the  performance  measure  against  which 
the  new  tests  are  to  be  validated.  The  ability  of  the  ASVAB  to  predict 
the  traditional  military  performance  criterion  of  training  grades  is 
typically  good  due  to  their  shared  academic  nature.  Training  grades  are 
often  based  on  written  examinations  of  job  knowledge  obtained  in  a 
classroom  setting.  Persons  performing  well  on  written  predictor  tests 
also  tend  to  perform  well  on  written  criterion  tests.  The  possibility 
of  additional  (or  different)  predictors  significantly  improving  the 
ASVAB-training grade relationship across a variety of jobs or clusters
of  jobs  is  unlikely. 

The joint-service Job Performance Measurement (JPM) project offers
a  unique  opportunity  for  the  validation  of  new  predictor  tests.  A 
primary  purpose  of  the  JPM  project  has  been  to  develop  objective  and 
standardized  measures  of  job  performance  that  reflect  the  broad  range  of 
military  job  requirements.  The  expanded  scope  of  the  hands-on 
performance  tests  will  measure  the  unique  abilities  that  are  needed  in 
the  work  setting  but  that  are  not  necessarily  required  for  academic 
success. In this way, the services will be able to differentially
associate the skills and abilities required in various jobs with the
predictors of those abilities so that the match of the person and job
can possibly be improved.

1. Efforts within the joint-service computerized adaptive testing (CAT)
project for the ASVAB are examining the use of computers for expanding
the measurement of aptitudes beyond those currently assessed by the
paper-and-pencil ASVAB. The Defense Advisory Committee on Military
Personnel Testing has noted that, "to a significant extent, the
practical value of a nationwide CAT system will depend on the success of
this research effort [investigation of additional predictive validity of
new predictor tests]" [2, p. 21].


Without  simultaneous  research  in  both  the  predictor  and  criterion 
realms,  analyses  of  incremental  validity  for  any  new  predictor  tests  may 
be  somewhat  misleading  and  will  certainly  be  incomplete.  By  limiting 
the focus to the existing ASVAB subtests predicting the more complete
criterion  measures  of  the  JPM  project,  only  that  part  of  job  performance 
that  is  the  product  of  the  four  highly  related  content  factors  will  be 
illuminated.  The  prediction  of  any  differential  abilities  required  for 
successful  job  performance  will  potentially  be  masked  due  to  the 
inadequacy  of  ASVAB  to  predict  those  dimensions  (and  therefore  appear  as 
a  lack  of  relationship  with  the  ASVAB).  Conversely,  research  involving 
new  predictors  validated  against  traditional  performance  measures  will 
possibly  be  fruitless  as  well.  Increments  in  validity  against  training 
criteria  may  be  hard  to  obtain  or  may  even  restrict  the  types  of  new 
predictors  to  tests  that  are  not  overly  different  from  the  current  math 
and  verbal  orientation  of  the  ASVAB. 


The  purpose  of  this  research  memorandum  is  to  investigate  the 
ability  of  several  new  predictor  tests  to  improve  the  prediction  of 
infantry  performance  beyond  what  the  ASVAB  is  currently  able  to 
achieve.  The  new  predictor  tests  were  administered  as  part  of  the 
Marine Corps JPM project. These tests included paper-and-pencil
measures of spatial ability, a video-firing test, and a background
questionnaire.  Increments  in  validity  due  to  these  new  tests  were 
judged  relative  to  the  complete  battery  of  ASVAB  subtests.  Two  sources 
of  aptitude  scores  were  examined:  ASVAB  at  time  of  enlistment  into  the 
Marine  Corps  and  a  concurrent  ASVAB  administered  as  part  of  the 
project.  Four  different  performance  criteria  were  also  examined: 
hands-on  job  performance  tests,  written  job  knowledge  tests,  proficiency 
marks  (Marine  Corps  operational  supervisory  ratings),  and  final  course 
grades  in  the  infantry  training  school.  Reliability  estimates  for  both 
the  predictors  and  criteria  were  computed  in  addition  to  the  absolute 
and  incremental  validities  of  each  new  predictor  test.  Summary  remarks 
noting  the  practical  significance  of  the  incremental  validity  for  the 
new  predictors  conclude  the  research  memorandum. 

TECHNICAL  CONSIDERATIONS  FOR  ASSESSING  INCREMENTAL  VALIDITY 

The  relationship  between  a  selection  test  (a  predictor)  and  a 
performance  measure  (a  criterion)  is  typically  expressed  in  terms  of 
their correlation (a validity coefficient). The difficulties that
impact  the  estimation  of  validity  are  well  known.  Such  difficulties  are 
magnified  when  examining  incremental  validity  since  such  analysis 
involves  differences  in  validity  coefficients.  The  incremental 
validities  computed  for  this  research  memorandum  are  not  a  unique 
statistic  but  rather  the  difference  between  two  validity  coefficients. 
The validity of the ASVAB to predict infantry performance serves as the
base and is subtracted from the validity of the ASVAB when supplemented
by an additional predictor test. Some of the technical considerations
affecting the computation of validities are briefly discussed.
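
The subtraction described above can be illustrated with a minimal sketch. It assumes ordinary least-squares regression and uses simulated data only; the function names (multiple_r, incremental_validity) and all numbers are hypothetical and are not taken from the JPM analyses.

```python
import numpy as np


def multiple_r(predictors, criterion):
    """Multiple correlation: correlation between the criterion and its
    least-squares prediction from the predictors (intercept included)."""
    design = np.column_stack([np.ones(len(criterion)), predictors])
    weights, *_ = np.linalg.lstsq(design, criterion, rcond=None)
    return float(np.corrcoef(design @ weights, criterion)[0, 1])


def incremental_validity(base_predictors, new_test, criterion):
    """Validity of (base + new test) minus validity of the base alone."""
    r_base = multiple_r(base_predictors, criterion)
    r_full = multiple_r(np.column_stack([base_predictors, new_test]), criterion)
    return r_base, r_full, r_full - r_base


# Simulated data only: ten "subtests", one "new test", one "criterion".
rng = np.random.default_rng(0)
n = 500
subtests = rng.normal(size=(n, 10))
new_test = rng.normal(size=n)
criterion = subtests @ np.full(10, 0.2) + 0.3 * new_test + rng.normal(size=n)

r_base, r_full, gain = incremental_validity(subtests, new_test, criterion)
print(f"base R = {r_base:.3f}, full R = {r_full:.3f}, increment = {gain:.3f}")
```

Because the increment is a difference between two estimated correlations, it inherits the sampling error and bias of both, which is why the corrections discussed in the following sections matter.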


Performance  Criterion 

The measure of job performance must be an accurate and objective
reflection of what an individual is required to perform on his job. If
the performance criterion is not representative of actual job
performance, its measurement is meaningless and its prediction would be
of no value.

In  1981,  the  Joint-Service  Job  Performance  Measurement  (JPM) 
project  was  initiated  to  facilitate  the  services'  development  of  valid 
measures  of  military  job  performance.  Because  of  its  high  fidelity  to 
actual  job  performance,  hands-on  performance  of  job- sample  tasks  was 
established  as  the  benchmark  criterion  measure.  A  National  Academy  of 
Sciences  (NAS)  committee  that  provides  technical  oversight  to  and 
evaluation of the joint-service project endorsed the services'
declaration  of  hands-on  tests  as  the  benchmark  criterion: 

The  hands-on  technology  is  not  just  another  means  of 
assessing  performance.  It  is  the  only  method,  short  of 
observing  people  on  the  job,  that  elicits  the  actual 
behaviors  required  to  perform  job  tasks.... The  very 
directness  of  the  hands-on  methodology  makes  it  in  theory 
the  ideal  criterion  measure...  [3,  p.  27]. 

Other  performance  measures  were  also  developed  or  collected  as  part 
of  the  Marine  Corps  JPM  project  (e.g.,  written  job  knowledge  tests, 
training  grades,  operational  performance  ratings).  Therefore,  the 
criteria collected by the JPM project offer a diverse array of performance
measures against which to evaluate the incremental validity of new
predictor  tests.  However,  greater  emphasis  will  be  ascribed  to  the 
outcomes  associated  with  the  hands-on  performance  measures  due  to  their 
greater  fidelity  to  actual  job  behaviors. 

Aptitude  Measures 

Incremental validity of new tests must be determined relative to
the existing set of predictors in the ASVAB. The complete set of ASVAB
subtests, not a composite of the subtests or a derived measure of
general cognitive ability, must be used as the validity standard against
which new tests are judged. This requirement provides a common base for
comparison of validity increments and recognizes the potential
fallibility of any composite. Any definition of the predictor set
other than the full complement of ASVAB subtests could lead to
underestimates of absolute validity, and thereby overestimates of
incremental validity. Therefore, all ten ASVAB subtests were used as
predictors to maximize the predictive validity currently available in
the ASVAB.

A second aptitude-relevant issue concerns the timing of test
administration for both the ASVAB and the new set of predictors.
Ideally, both the ASVAB and the new predictors should be administered at
the same time (preferably at time of enlistment). However, such a
longitudinal analysis of increments in validity is not possible for the
current study.

An alternative strategy is to readminister the ASVAB so that it is
concurrent  with  the  administration  of  the  new  predictor  tests.  This 
concurrent  administration  of  all  predictor  measures  attempts  to  control 
for  extraneous  factors.  Such  factors  may  possibly  include  gains  in  test 
performance  due  to  training,  experience,  or  individual  maturity  that  may 
have occurred during the lapse between testing periods. Also, concurrent
administration seeks to minimize motivational differences across
testing sessions. Since administration of the new predictors was not
possible at the time of enlistment for this project, the ASVAB was
readministered as part of the Marine Corps JPM project so that
differences  in  incremental  validity  could  be  evaluated  as  a  function  of 
enlistment  and  concurrent  aptitude. 

Correction  for  Range  Restriction 

A  validity  coefficient  computed  on  a  sample  of  job  incumbents  will 
generally  underestimate  the  true  validity  of  a  selection  test  for  the 
population  of  applicants  to  which  the  test  is  administered.  This  is 
because  the  selection  process  restricts  the  distributions  of  both 
predictor  and  criterion  scores  by  screening  out  potentially  unsuccessful 
applicants.  The  degree  of  range  restriction  differs  across 
specialties:  standards  for  low-level  jobs  would  tend  to  screen  out 
relatively  few  applicants;  standards  for  more  technically  demanding  jobs 
would  tend  to  be  more  restrictive. 

To  be  able  to  compare  validity  coefficients  across  jobs  with 
differing  degrees  of  selection,  the  coefficients  must  be  placed  on  a 
common  scale.  "Correction  for  range  restriction"  produces  this  common 
metric  by  estimating  what  the  validity  would  be  in  the  full  population 
of  potential  applicants.  The  1980  youth  population  served  as  the 
reference  population  from  which  all  corrections  for  this  research 
memorandum were derived [4]. A multivariate range correction procedure
was  used  that  accounts  for  the  effects  of  selecting  individuals  on  all 
ten  ASVAB  subtests  [5].  Because  population  variances  are  not  available 
for  the  new  predictor  tests,  corrections  to  validity  coefficients  due  to 
range  restrictions  accounted  for  explicit  selection  only  on  the  ASVAB, 
not  the  new  predictors.  The  new  predictors  were  treated  as  incidental 
selection  variables  in  the  correction  procedures. 
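
As a rough illustration of a multivariate range correction, the sketch below uses the standard Pearson-Lawley equations, with the explicit selection variables (here standing in for the ASVAB subtests) listed first and all remaining variables (new predictors, criteria) treated as incidental. Whether the procedure in [5] matches this form in every detail is an assumption; the function names and the numbers are hypothetical.

```python
import numpy as np


def lawley_correction(sample_cov, n_explicit, population_cov_explicit):
    """Pearson-Lawley correction of a restricted-sample covariance matrix.

    sample_cov: covariance matrix from the selected sample, with the
        n_explicit explicit selection variables in the first rows/columns.
    population_cov_explicit: covariance matrix of the explicit selection
        variables in the unrestricted reference population.
    """
    k = n_explicit
    v_xx = sample_cov[:k, :k]
    v_xy = sample_cov[:k, k:]
    v_yy = sample_cov[k:, k:]
    s_xx = np.asarray(population_cov_explicit, dtype=float)

    b = np.linalg.solve(v_xx, v_xy)             # V_xx^{-1} V_xy
    s_xy = s_xx @ b                             # corrected cross-covariances
    s_yy = v_yy - v_xy.T @ b + b.T @ s_xx @ b   # corrected incidental block
    return np.block([[s_xx, s_xy], [s_xy.T, s_yy]])


def to_correlations(cov):
    """Convert a covariance matrix to a correlation matrix."""
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


# Illustrative numbers only: one selection test and one criterion.
r_sample = 0.30
sd_x_sample, sd_x_population, sd_y_sample = 0.5, 1.0, 1.0
cov_xy = r_sample * sd_x_sample * sd_y_sample
sample_cov = np.array([[sd_x_sample**2, cov_xy],
                       [cov_xy, sd_y_sample**2]])

corrected = to_correlations(
    lawley_correction(sample_cov, 1, [[sd_x_population**2]]))
r_corrected = float(corrected[0, 1])
print(f"sample r = {r_sample:.3f}, corrected r = {r_corrected:.3f}")
```

With a single selection variable the result reduces to the familiar univariate correction, r u / sqrt(1 - r^2 + r^2 u^2), where u is the ratio of population to sample standard deviation.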

Shrinkage of Multiple Correlations and Cross Validation

Multiple  correlations  (MRs)  are  merely  extensions  of  simple 
correlation  coefficients  in  that  the  criterion  is  regressed  on  multiple 
predictor measures as opposed to one. The square of the MR expresses
the magnitude of the predictive power of the regression. Regression
weights  are  assigned  to  each  predictor  to  maximize  the  MR  for  the  sample 
on  which  the  regression  is  computed.  If  the  regression  weights  are  then 
applied  to  a  different  sample,  the  resulting  MR  will  almost  always  be 
smaller  than  the  MR  obtained  in  the  original  sample.  This  decrement  in 
MRs  is  referred  to  as  "shrinkage." 

The  degree  of  shrinkage  is  primarily  a  function  of  the  number  of 
predictors  and  sample  size.  The  best  procedure  for  estimating  the 
degree  of  shrinkage  is  to  perform  a  cross-validation.  This  requires 
that  the  available  observations  are  split  into  two  random  samples  (one 
for estimation and the other for validation). Predicted values of the
criterion  variable  are  computed  in  the  validation  sample  based  on  the 
weights  determined  in  the  estimation  sample.  The  correlation  between 
the  actual  and  predicted  values  is  then  computed.  The  difference 
between  this  correlation  and  the  MR  in  the  estimation  sample  is  an 
estimate of the shrinkage. If the shrinkage is small (and MR is
meaningful), then the estimation regression is warranted for future
predictions.
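
The split-sample procedure just described can be sketched as follows. This is a minimal illustration on simulated data, assuming ordinary least squares; the function names and the data are hypothetical.

```python
import numpy as np


def fit_weights(predictors, criterion):
    """Least-squares regression weights (intercept first)."""
    design = np.column_stack([np.ones(len(criterion)), predictors])
    weights, *_ = np.linalg.lstsq(design, criterion, rcond=None)
    return weights


def predict(weights, predictors):
    """Predicted criterion values from previously estimated weights."""
    return weights[0] + predictors @ weights[1:]


def split_sample_shrinkage(predictors, criterion, rng):
    """Estimate weights on a random half, then apply them to the other half."""
    order = rng.permutation(len(criterion))
    half = len(criterion) // 2
    est, val = order[:half], order[half:]

    weights = fit_weights(predictors[est], criterion[est])
    r_estimation = np.corrcoef(predict(weights, predictors[est]),
                               criterion[est])[0, 1]
    r_validation = np.corrcoef(predict(weights, predictors[val]),
                               criterion[val])[0, 1]
    return float(r_estimation), float(r_validation), float(r_estimation - r_validation)


# Simulated data only: ten predictors and one criterion.
rng = np.random.default_rng(1)
n = 300
predictors = rng.normal(size=(n, 10))
criterion = predictors @ np.full(10, 0.2) + rng.normal(size=n)

r_est, r_val, shrinkage = split_sample_shrinkage(predictors, criterion, rng)
print(f"estimation R = {r_est:.3f}, validation r = {r_val:.3f}, "
      f"shrinkage = {shrinkage:.3f}")
```

A single random split gives a noisy estimate, and each half has only part of the data; this is the loss of precision that the formula methods are meant to avoid.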

Formula  methods  have  been  derived  to  estimate  the  degree  of 
shrinkage  in  MRs  as  opposed  to  the  computing  of  separate  regressions  on 
a  split  sample  [6].  These  formulas  make  use  of  all  observations  and 
result  in  more  precise  estimates  of  the  shrinkage. 

Computing an estimate of the population cross-validated multiple
correlation (CVR) is a two-stage process. First, an estimate of the squared
population multiple correlation (\rho^2) is computed:

    \hat{\rho}^2 = 1 - \frac{N - 1}{N - p - 1} (1 - R^2)                    (1)

where N is the sample size, p is the number of predictors, and R^2 is the
observed squared multiple correlation. This quantity is then used as
input for computing the CVR:

    CVR^2 = \frac{(N - 1) \hat{\rho}^4 + \hat{\rho}^2}{(N - p) \hat{\rho}^2 + p}    (2)

where all symbols are defined above. The square root of this quantity
is the value used throughout this research memorandum for computing the
validity base and incremental gains due to the new predictors.
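
The two-stage computation can be sketched in a few lines. The code assumes formula (1) is the adjusted squared multiple correlation, 1 - (N - 1)/(N - p - 1) * (1 - R^2), and formula (2) is ((N - 1) rho^4 + rho^2) / ((N - p) rho^2 + p), as read from the text above. The function names, the guard against negative estimates, and the example values are assumptions added for illustration.

```python
import math


def adjusted_rho_squared(r_squared, n, p):
    """Formula (1): estimate of the squared population multiple correlation."""
    return 1.0 - (n - 1) / (n - p - 1) * (1.0 - r_squared)


def cross_validated_r(r_squared, n, p):
    """Formula (2), followed by the square root, giving the CVR."""
    # Assumption: a negative estimate from formula (1) is truncated at zero.
    rho2 = max(adjusted_rho_squared(r_squared, n, p), 0.0)
    cvr2 = ((n - 1) * rho2**2 + rho2) / ((n - p) * rho2 + p)
    return math.sqrt(cvr2)


# Illustrative values only: an observed R of 0.60 (R^2 = 0.36) with
# 11 predictors (e.g., ten subtests plus one new test).
for n in (150, 300, 1000):
    print(n, round(cross_validated_r(0.36, n, 11), 3))
```

For a fixed observed R, the estimated CVR falls further below R as the sample gets smaller or the number of predictors grows, which is the pattern the text describes.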

Formula  (1)  applies  only  to  the  case  where  the  predictors  are 
considered  fixed,  as  in  a  typical  selection  and  classification 
process.  Fixed  predictors  imply  that  generalizations  based  on  the  CVR 
pertain  only  to  the  exact  set  of  predictors  under  investigation  (the  ten 
ASVAB  subtests  in  this  case)  and  not  to  a  population  of  predictors. 

Criterion Unreliability

Not all performance criteria are measured with the same
reliability.  To  the  extent  that  the  criteria  are  unreliable  and  contain 
measurement  error,  estimates  of  validity  coefficients  will  also  be 
affected.  Theoretically,  a  test  cannot  correlate  with  another  variable 
more  highly  than  it  correlates  with  its  own  true  score  (a  test  score 
measured  with  no  error);  therefore,  test  validity  cannot  exceed  the 
square  root  of  test  reliability. 

It  follows  that  the  increments  in  validity  of  new  predictor  tests 
computed  against  multiple  performance  criteria  may  be  affected  by 
differences  in  criteria  reliabilities.  Corrections  to  validities  can  be 
made  to  compensate  for  unequal  measurement  reliability  (see  [7, 
p.  69]).  Such  corrected  values  are  the  maximum  coefficients  that  are 
obtainable  if  all  measurement  error  could  be  eliminated,  i.e.,  perfect 
criterion  reliability.  An  accurate  estimate  of  the  criterion 
reliability  is  essential  to  obtaining  the  proper  correction. 
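
A minimal sketch of such a correction follows, assuming the classical correction for attenuation applied to the criterion only (observed validity divided by the square root of the criterion reliability). The function names and the reliability value are hypothetical; no reliabilities from this memorandum are used.

```python
import math


def validity_ceiling(reliability):
    """Upper bound on validity implied by a test's reliability."""
    return math.sqrt(reliability)


def correct_for_criterion_unreliability(validity, criterion_reliability):
    """Estimated validity against a perfectly reliable criterion."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion reliability must be in (0, 1]")
    return validity / math.sqrt(criterion_reliability)


# Illustrative numbers only.
observed_validity = 0.35
criterion_reliability = 0.49
corrected_validity = correct_for_criterion_unreliability(
    observed_validity, criterion_reliability)
print(round(corrected_validity, 3))
```

Because the correction divides by the square root of the reliability estimate, an error in that estimate passes directly into the corrected coefficient, which is why an accurate reliability estimate is essential.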

The  primary  concern  for  this  research  memorandum  is  relative 
comparisons  among  validity  gains  for  new  predictors  within  a  criterion, 
not  absolute  comparisons  of  the  magnitude  of  validity  increments  across 
criteria.  The  focus  of  the  analyses  is  on  the  hands-on  performance 
measures,  and  the  other  criteria  were  examined  for  the  relative 
consistency of outcomes. Therefore, corrections to validity coefficients
for criterion unreliability were not computed. (As will be shown
in a later section, the differences in criterion reliability were not as
discrepant as expected, so such corrections would not have a differential
impact on the results.) However, sufficient information is
provided  in  the  tables  to  allow  such  corrections  to  be  calculated. 

Controlling  for  Time  in  Service 

As  noted  earlier,  validities  may  be  adversely  affected  by  a  time 
lapse  between  the  administration  of  the  enlistment  predictors  and  the 
new  predictors  of  interest.  To  account  for  the  possible  impact  of 
temporal differences, the ASVAB was readministered so that all predictor
information  would  be  collected  at  the  same  time  and  under  the  same 
conditions. 

However,  the  examinees  of  the  JPM  sample  also  differed  with  respect 
to  their  length  of  service,  ranging  from  5  to  48  months.  Such  time 
differences  may  affect  performance  on  the  predictor  tests  and/or  the 
performance  tests  simply  due  to  on-the-job  experience,  training,  or 
maturity.  To  control  for  these  potential  developmental  effects,  a 
separate  set  of  analyses  used  time  in  service  (TIS)  and  its  square  as 
covariates  in  each  regression  before  the  new  predictor  test  was 
entered.  In  this  manner,  performance  scores  were  statistically  adjusted 
as  if  all  examinees  had  the  same  number  of  months  of  service. 
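
The covariate adjustment can be sketched as a hierarchical regression in which TIS and its square enter the base model before the new test is added. The example uses simulated data in which the new test is deliberately correlated with time in service; the function names, effect sizes, and sample size are hypothetical and chosen only to show the mechanism.

```python
import numpy as np


def multiple_r(predictors, criterion):
    """Correlation between the criterion and its least-squares prediction."""
    design = np.column_stack([np.ones(len(criterion)), predictors])
    weights, *_ = np.linalg.lstsq(design, criterion, rcond=None)
    return float(np.corrcoef(design @ weights, criterion)[0, 1])


def validity_gain(base, new_test, criterion):
    """Increase in multiple R when the new test is added to the base set."""
    full = np.column_stack([base, new_test])
    return multiple_r(full, criterion) - multiple_r(base, criterion)


# Simulated data only.
rng = np.random.default_rng(2)
n = 2000
subtests = rng.normal(size=(n, 10))
tis = rng.uniform(5, 48, size=n)              # months of service
tis_z = (tis - tis.mean()) / tis.std()
new_test = 0.6 * tis_z + 0.8 * rng.normal(size=n)
criterion = (subtests @ np.full(10, 0.2) + 0.5 * tis_z
             + 0.1 * new_test + rng.normal(size=n))

# Base without TIS versus base with TIS and TIS squared as covariates.
gain_unadjusted = validity_gain(subtests, new_test, criterion)
base_with_tis = np.column_stack([subtests, tis, tis**2])
gain_adjusted = validity_gain(base_with_tis, new_test, criterion)
print(f"unadjusted gain = {gain_unadjusted:.3f}, "
      f"TIS-adjusted gain = {gain_adjusted:.3f}")
```

In this simulated setup the unadjusted increment is inflated because the new test acts partly as a proxy for experience; entering TIS first removes that portion.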

TEST ADMINISTRATION


Each Marine was tested for two days. One day was devoted to hands-on
testing and the other day was for written tests. All tests were
administered by retired Marines who received extensive training in how
to administer tests in a standardized manner and accurately score and
record test performance. The administrators specialized in giving
either the hands-on tests or the written tests. Multiple administrators
rated the performance of selected examinees to monitor the scoring
consistency and accuracy of test administrators throughout the
four-month testing period.

Examinees were first-term infantrymen from four MOSs. Over 1,000
riflemen were tested, and about 300 Marines in each of the other three
specialties were examined: machinegunner, mortarman, and assaultman.
Examinees  were  randomly  selected  for  testing  by  Headquarters,  Marine 
Corps,  so  that  reasonable  distributions  of  time  in  service,  paygrade, 
and  educational  level  were  obtained.  Approximately  20  percent  of  the 
riflemen  were  retested  on  all  materials  after  an  interval  of  7-10  days. 

Criterion  Measures 

Four  performance  measures  were  collected  for  each  Marine.  A 
description  of  each  measure  follows. 

Hands-on performance tests (HOPT) were developed for the four
first-term infantry MOSs. Based on official Marine Corps publications,
training  materials,  and  extensive  task  analyses  by  job  experts,  the 
domain  of  infantry  job  requirements  was  specified.  Tasks  were  organized 
into  relatively  homogeneous  content  areas,  called  duty  areas  (e.g.,  land 
navigation,  tactical  measures,  grenade  launcher,  squad  automatic 
weapon).  Job  requirements  differed  across  the  four  MOSs,  although  there 
was  a  large  core  of  common  infantry  tasks.  Each  MOS  had  13-14  duty 
areas.  Tasks  were  sampled  from  each  duty  area  so  that  hands-on  test 
scores  would  generalize  to  the  full  range  of  infantry  job  requirements 
within  that  duty  area  [8].  Alternate  forms  of  the  hands-on  test  were 
developed  in  response  to  test  security  concerns  and  also  to  examine  test 
reliability. 

A  written  job  knowledge  test  (JKT)  was  also  developed  to  parallel 
the  content  of  the  hands-on  test.  A  separate  written  test  composed  of 
about  200  items  was  developed  for  each  MOS.  No  time  limits  were 
imposed,  but  examinees  typically  finished  in  two  hours.  An  alternate 
form  of  the  JKT  was  also  constructed. 

Operational Marine Corps supervisory ratings, called proficiency
marks (PRO), were obtained from Headquarters, Marine Corps. Proficiency
marks are given every six months to enlisted personnel, or earlier if an
individual is transferred to another unit. The rating score used for
these analyses was the mean of all available proficiency marks for an
individual. Over 90 percent of the Marines tested in the JPM project
had received at least three proficiency marks; the average person had
received more than five ratings.

Training grades (GPA) in the School of Infantry were also collected
from historical records. Grades could not be found for all Marines who
were  administered  the  new  predictor  tests.  Other  analyses  of  training 
grades  have  shown  that  different  relationships  exist  between  aptitude 
and  grades  for  the  two  training  locations  (Base  A  and  Base  B)  [9]; 
therefore,  the  two  bases  were  analyzed  separately. 

Predictor  Tests 

The  new  predictors  included  three  paper-and-pencil  tests,  a  video 
firing  test,  and  a  biographical  questionnaire.  Below  is  a  description 
of  each. 


The  Space  Perception  (SP)  test  was  a  paper-and-pencil  test  that 
measured  spatial  visualization.  The  test  was  administered  as  part  of 
ASVAB  5/6/7  and  was  composed  of  20  items  that  required  visualization  of 
paper-folding  and  -unfolding  tasks.  Twelve  minutes  were  allowed  to 
complete  the  test. 

The  Assembling  Objects  (AS)  test  was  obtained  from  the  Army's  JPM 
project  [10].  The  paper-and-pencil  test  was  a  measure  of  spatial 
visualization  and  mental  rotation.  There  were  36  items  and  the  time 
limit  was  18  minutes. 


The  Reasoning  (RS)  test  was  also  obtained  from  the  Army's  JPM 
project  [10]  and  was  composed  of  30  written  items  that  measured  spatial 
reasoning  and  pattern  recognition.  A  time  limit  of  12  minutes  was 
imposed. 


A  test  of  video  firing  (VF)  was  administered  to  assess  psychomotor 
skills.  The  test  required  firing  a  pistol  at  moving  targets  on  a  video 
screen.  The  test  consisted  of  four  shooting  trials  for  up  to  five 
scenarios of increasing difficulty. The test was untimed but typically
required  10-15  minutes  to  complete. 

A shortened version of the Armed Services Applicant Profile (ASAP)
was also administered. ASAP was a biographical questionnaire that was
obtained from the executive agent for the joint-service instrument
[11]. The administration was untimed but required approximately 20-30
minutes to complete the 60-item form.


The ASVAB was readministered so that the new predictor tests could
be evaluated against concurrent aptitude information. The full
battery was group administered and required approximately three hours to
complete. To motivate examinees to perform to the best of their
abilities, a strong incentive was provided: if the ASVAB scores from the
JPM administration exceeded an individual's scores of record, the higher
JPM scores would be substituted. This motivator was effective because
many enlisted personnel seek to transfer to other occupational fields or
apply for the warrant officer program, which have higher aptitude
requirements. Approximately 60 percent of the Marines who participated
in the JPM testing satisfied the necessary criteria and improved their
aptitude scores of record.

RESULTS 

Reliability  Estimates 

Tables  1  through  4  present  the  reliability  estimates  for  three  of 
the  criterion  measures  (reliability  could  not  be  computed  for  training 
grades)  and  all  the  new  predictor  tests.  Where  appropriate,  the 
following  reliability  estimates  were  computed: 


o  Test-retest: both test forms of the hands-on test and job
knowledge test and the same form for the new predictors were
readministered to about a 20-percent sample of the infantry
riflemen after an interval of 7-10 days.

o  Alpha  coefficient:  a  measure  of  the  internal  consistency  of 
test  items  (or  tasks)  that  reflects  the  degree  to  which  item 
responses  are  homogeneous. 

o  Scorer  agreement:  the  percentage  agreement  between  two  test 
administrators  as  they  observe  and  score  the  step-level 
performance  of  one  examinee. 

o  Analysis of variance (ANOVA) reliability: similar to the alpha
coefficient in that the statistic indicates the consistency
among multiple observations of the same performance measure.
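
Two of the estimates listed above can be illustrated with a short sketch.
The alpha coefficient below uses the standard Cronbach formula, and scorer
agreement is computed as the proportion of steps on which two
administrators gave the same score. The function names and the toy score
matrices are hypothetical; the memorandum does not show its own
computations.

```python
import numpy as np

def cronbach_alpha(scores):
    """Alpha coefficient for an examinee-by-item (or task) score matrix."""
    scores = np.asarray(scores, dtype=float)
    k = scores.shape[1]                                # number of items
    item_var_sum = scores.var(axis=0, ddof=1).sum()    # sum of item variances
    total_var = scores.sum(axis=1).var(ddof=1)         # variance of total scores
    return (k / (k - 1)) * (1.0 - item_var_sum / total_var)

def scorer_agreement(scorer_a, scorer_b):
    """Proportion of scored steps on which two administrators agree."""
    a = np.asarray(scorer_a)
    b = np.asarray(scorer_b)
    return float((a == b).mean())

# Toy data: 5 examinees by 4 items, scored 0/1 (hypothetical values).
items = [[1, 1, 1, 0],
         [1, 0, 1, 1],
         [0, 0, 0, 0],
         [1, 1, 1, 1],
         [0, 1, 0, 0]]
print(f"alpha = {cronbach_alpha(items):.2f}")

# Step-level scores (1 = go, 0 = no go) from two administrators
# watching the same examinee (hypothetical values).
print(f"agreement = {scorer_agreement([1, 0, 1, 1], [1, 0, 0, 1]):.2f}")
```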

The  hands-on  tests  were  found  to  be  very  reliable  (see  table  1). 
Test-retest  reliability  was  0.70.  There  was  a  significant  retest  gain 
in  performance  of  over  0.8  standard  deviation.  Such  gains  over  a  time 
period  of  7-10  days  may  reflect  the  positive  impact  of  practice  on  the 
performance  of  infantry  tasks  or  simply  a  better  understanding  of  the 
hands-on  testing  procedures.  Further  analysis  of  these  retest 
improvements  showed  that  the  gains  were  not  related  to  aptitude;  both 
high-  and  low-aptitude  personnel  made  equivalent  advances  in 
performance.  Alpha  coefficients  were  consistently  high  for  all  MOSs. 
Test  administrators  also  agreed  on  the  scoring  of  the  performance  that 
they  observed. 


As expected, the written job knowledge test was found to be
slightly more reliable than the hands-on measures. Table 2 shows that
the test-retest reliability was 0.73 with no retest gains. The alpha
coefficients ranged from 0.87 to 0.90 for the four MOSs. The JKT was a
difficult test: an infantryman on average answered about 45 percent of
the written items correctly.

Table 1.  Reliability of hands-on performance test

Reliability             Reliability
measure                 estimate      Other relevant information
--------------------------------------------------------------------------
Test-retest                           Initial test      Retest
                                      Mean    SD        Mean    SD       N
  Rifleman              0.70          52.4    8.6       59.4    8.2      190

Alpha coefficient (a)                 Number of test items               N
  Rifleman              0.87          71 and 68 tasks                    880
  Machinegunner         0.87          72 and 70 tasks                    257
  Mortarman             0.88          75 and 72 tasks                    217
  Assaultman            0.83          80 and 76 tasks                    239

Scorer agreement
  Rifleman              0.90
  Machinegunner         0.90
  Mortarman             0.89
  Assaultman            0.90

a.  Alpha reliability estimates are the mean for the two forms of
the hands-on test.  Differences between the two coefficients
for any MOS were never greater than 0.02.


Table 2.  Reliability of job knowledge test

Reliability             Reliability
measure                 estimate      Other relevant information
--------------------------------------------------------------------------
Test-retest                           Initial test      Retest
                                      Mean    SD        Mean    SD       N
  Rifleman              0.73          43.5    9.0       43.8    10.5     189

Alpha coefficient (a)                 Number of test items               N
  Rifleman              0.89          199 for each test form             896
  Machinegunner         0.89          190 for each test form             306
  Mortarman             0.90          189 for each test form             312
  Assaultman            0.87          190 for each test form             314

a.  Alpha reliability estimates are the mean for the two forms of
the job knowledge test.  Differences between the two coefficients
for any MOS were never greater than 0.02.

A simple analysis of variance design of subjects, ratings, and
their  interaction  showed  that  proficiency  marks  were  reasonably  stable 
and  consistent.  Three  reliability  estimates  were  computed  based  on  the 
three,  four,  and  five  most  recent  ratings  that  an  individual  had 
received.  Table  3  reports  reliabilities  for  the  ratings  that  ranged 
from  0.66  to  0.70. 


Table 3.  Reliability of proficiency marks

Reliability                 Reliability       Mean squares
measure                     estimate       Between     Within       N
----------------------------------------------------------------------
ANOVA reliability
  3 most recent ratings       0.66          24.09       8.17       1755
  4 most recent ratings       0.67          25.54       8.42       1406
  5 most recent ratings       0.70          25.42       7.67       1104
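
The reliabilities in table 3 can be recovered from the reported mean
squares. The memorandum does not print the formula it used; the sketch
below assumes the common ANOVA-based estimate, (MS between - MS within) /
MS between, which reproduces the three tabled values when rounded to two
decimals. The function name is hypothetical.

```python
def anova_reliability(ms_between, ms_within):
    """ANOVA-based reliability: share of between-subject variance that is not error."""
    return (ms_between - ms_within) / ms_between

# Mean squares (between, within) taken from table 3.
table3 = {
    "3 most recent ratings": (24.09, 8.17),
    "4 most recent ratings": (25.54, 8.42),
    "5 most recent ratings": (25.42, 7.67),
}

for label, (msb, msw) in table3.items():
    print(f"{label}: {anova_reliability(msb, msw):.2f}")
```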

Given  that  the  new  predictor  tests  were  somewhat  shorter  in  length, 
their  reliabilities  tended  to  be  slightly  lower  than  those  of  the 
criterion  measures.  Table  4  shows  that  test-retest  estimates  were  high 
for SP and ASAP, and relatively low for the other three tests. The ASAP
is  a  factual  questionnaire,  so  such  high  reliabilities  were  expected.  A 
significant  retest  gain  of  about  0.75  standard  deviation  was  noted  for 
VF;  all  other  tests  showed  negligible  improvements.  Again,  further 
analysis  of  the  VF  retest  improvements  showed  that  they  were  not  related 
to  aptitude.  Alpha  coefficients  for  each  test  were  also  moderately 
high. 


Table 4.  Reliability of new predictor tests

Reliability          Reliability
measure              estimate      Other relevant information
--------------------------------------------------------------------------
Test-retest                        Initial test       Retest
                                   Mean     SD        Mean     SD       N
  SP                 0.73          11.4     3.9       11.9     4.2      197
  RS                 0.58          18.9     5.8       19.2     6.2      197
  AS                 0.57          22.3     7.2       22.3     8.1      197
  VF                 0.63          198.6    30.3      221.2    38.3     211
  ASAP               0.90          5.8      13.1      5.2      13.9     192

Alpha coefficient                  Number of test items                 N
  SP                 0.78          20 items                             1837
  RS                 0.85          30 items                             1837
  AS                 0.88          36 items                             1837
  VF                 0.82          4 trials                             1849
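
The retest gains quoted in the text (over 0.8 standard deviation for the
hands-on test and about 0.75 for VF) can be checked against the
test-retest means in tables 1, 2, and 4. The sketch below assumes the
gain is expressed in units of the initial-test standard deviation; the
memorandum does not say which standard deviation was used, and the
function name is hypothetical.

```python
def retest_gain_sd(initial_mean, initial_sd, retest_mean):
    """Retest gain expressed in initial-test standard deviation units."""
    return (retest_mean - initial_mean) / initial_sd

# (initial mean, initial SD, retest mean) from tables 1, 2, and 4.
retest = {
    "HOPT": (52.4, 8.6, 59.4),
    "JKT":  (43.5, 9.0, 43.8),
    "SP":   (11.4, 3.9, 11.9),
    "RS":   (18.9, 5.8, 19.2),
    "AS":   (22.3, 7.2, 22.3),
    "VF":   (198.6, 30.3, 221.2),
    "ASAP": (5.8, 13.1, 5.2),
}

gains = {name: retest_gain_sd(*values) for name, values in retest.items()}
for name, gain in gains.items():
    print(f"{name}: {gain:+.2f} SD")
```

Under this assumption only HOPT and VF show gains of more than a small
fraction of a standard deviation, which matches the description in the
text.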
Estimates of New Predictor Uniqueness

A  necessary,  but  not  sufficient,  condition  for  new  predictors  to 
demonstrate  increments  in  validity  is  that  the  new  tests  need  to  measure 
aptitudes  that  are  somewhat  unique  relative  to  the  ASVAB.  Predictors 
that  have  high  correlations  with  ASVAB  can  improve  validity  only  by 
enhancing  test  reliability,  which  is  unlikely  given  the  already  high 
ASVAB  reliabilities.  New  tests  that  measure  unique  aptitudes  have 
potential  for  incremental  validity. 

The  uniqueness  (U)  of  a  new  test  is  defined  as  the  reliable 
variance  of  the  test  that  is  not  related  to  ASVAB: 

U = Rel(NP) - R^2(NP, ASVAB)          (3)

where Rel(NP) is the reliability of the new predictor test (NP), and
R^2(NP, ASVAB) is the squared multiple correlation, adjusted for
shrinkage, for the regression of the new predictor test on all ASVAB
subtests.

The estimates of uniqueness for each new predictor test are presented in
table 5. These estimates were computed based on both enlistment and
concurrent aptitude information using test-retest as the measure of
reliability.
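
Equation 3 can be sketched in a few lines. The shrinkage adjustment the
memorandum applied is not given in this section, so the Wherry-type
adjusted R^2 below is a stand-in assumption, and the input values (a
reliability of 0.73, a sample R^2 of 0.36, 870 examinees, and 10 subtests)
are illustrative only. They will not reproduce table 5, whose entries
were also corrected for range restriction.

```python
def adjusted_r2(r2, n, k):
    """Wherry-type shrinkage adjustment for a squared multiple correlation.

    r2: sample squared multiple correlation
    n:  number of examinees
    k:  number of predictors (here, ASVAB subtests)
    """
    return 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1)

def uniqueness(reliability, r2_adjusted):
    """Equation 3: reliable variance of a new test not shared with ASVAB."""
    return reliability - r2_adjusted

# Illustrative (hypothetical) inputs.
r2_adj = adjusted_r2(r2=0.36, n=870, k=10)
u = uniqueness(reliability=0.73, r2_adjusted=r2_adj)
print(f"adjusted R^2 = {r2_adj:.3f}, uniqueness = {u:.3f}")
```

A test with high reliability can still have low uniqueness if most of its
reliable variance is already explained by the ASVAB subtests.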


Table 5.  Uniqueness estimates (a) for new predictor tests relative to
enlistment and concurrent aptitude scores

New predictor            Aptitude scores
test                 Enlistment    Concurrent
---------------------------------------------
SP                      0.39          0.36
RS                      0.25          0.20
AS                      0.29          0.25
VF                      0.40          0.39
ASAP                    0.81          0.78

a.  Estimates were based on test-retest reliability of new predictors
and multiple correlations of the new predictors regressed on all ASVAB
subtests.  Reliabilities and multiple correlations were corrected for
range restriction.


There was essentially no difference in the uniqueness estimates
based on enlistment and concurrent aptitude. The ASAP showed the
highest uniqueness due to both its high test-retest reliability and lack
of relationship with the ASVAB subtests. Video firing and the space
perception test were comparable with moderate levels of uniqueness; the
reasoning and assembling objects tests showed the least promise of
having unique and reliable variance. From the uniqueness perspective,
ASAP, video firing, and space perception would be the best candidate
tests for possibly improving the validity of the ASVAB against infantry
job performance.

Intercorrelations  and  First-Order  Validity 

The  intercorrelations  among  the  new  predictors  were  examined  to 
determine  the  degree  to  which  the  tests  measured  the  same  concept.  The 
relationship  between  the  new  predictors  and  ASVAB  as  well  as  the 
validity  of  each  test  with  five  performance  criteria  were  computed. 
Table 6 reports these results for the infantry rifleman. The
correlations are corrected for range restriction; sample and corrected
correlation values are reported in appendix A for each MOS.


Table 6.  Correlations of infantry rifleman criteria and predictors
corrected for range restriction

              Criterion                                    Predictor
              HOPT     JKT      PRO      GPA A(a) GPA B(a) SP       RS       AS       VF       ASAP
-----------------------------------------------------------------------------------------------------
Enlistment
  AFQT        0.56     0.77     0.34     0.61     0.40     0.47     0.60     0.47     0.42     0.27
  GT          0.63     0.78     0.35     0.65     0.40     0.55     0.63     0.54     0.49     0.23
  ASVAB(b)    0.67     0.80     0.38     0.66     0.41     0.61     0.65     0.59     0.54     0.33

Concurrent
  AFQT        0.58     0.81     0.38     0.61     0.40     0.50     0.63     0.52     0.44     0.29
  GT          0.63     0.80     0.39     0.63     0.41     0.56     0.67     0.58     0.49     0.26
  ASVAB(b)    0.69     0.83     0.41     0.67     0.42     0.64     0.69     0.63     0.55     0.37

Predictors
  SP          0.45     0.46     0.23     0.37     0.24     1.00     0.54     0.59     0.38     0.10
  RS          0.47     0.59     0.29     0.43     0.33     0.54     1.00     0.63     0.40     0.21
  AS          0.47     0.55     0.23     0.41     0.23     0.59     0.63     1.00     0.40     0.17
  VF          0.49     0.42     0.27     0.44     0.24     0.38     0.40     0.40     1.00     0.11
  ASAP        0.22     0.29     0.31     0.14     0.09     0.10     0.21     0.17     0.11     1.00

Mean          52.80    44.35    43.69    49.83    50.13    11.01    18.76    22.03    196.1    6.56
SD            10.22    12.08    2.19     11.62    10.51    4.32     6.40     7.86     33.71    13.03
N             870      862      870      512      641      870      870      870      870      870

a.  Statistics for GPA include examinees from other MOSs.

b.  The correlations and validities for ASVAB represent multiple
correlations based on all ASVAB subtests.

Three major observations were drawn from table 6. First, the three
paper-and-pencil measures of spatial ability (SP, RS, and AS) were
highly correlated (0.54 to 0.63). The video firing test was moderately
related to the spatial tests and, as expected, the ASAP was not overly
related to any of the other predictor measures. Second, the
intercorrelations between the new predictors and the existing ASVAB
subtests showed RS to be most highly related to ASVAB, and ASAP the
least related. The results were consistent for both enlistment and concurrent
aptitude scores. Third, the pattern of validities between the new tests
and the five performance criteria was very similar: ASAP was least
related to each performance criterion except PRO, where it showed the
highest correlation of the new tests (0.31); all other new predictors
were about equally related to the performance measures. Similar
correlations were noted for the other MOSs, which are reported in
appendix A.

The multiple correlations noted in table 6 between ASVAB and each
performance criterion provided the base against which all judgments of
incremental validity were made. The validities show that ASVAB was
highly related to JKT (0.80), HOPT (0.67), and GPA for Base A (0.66).
The ASVAB was moderately related to PRO (0.38) and GPA for Base B
(0.41). Similar validities were noted for concurrent aptitude
information. The new tests would have to demonstrate improvements in
validity above and beyond these levels that ASVAB is currently able to
achieve.

Incremental  Validity 

Tables 7 through 11 report the ASVAB validity base (taken from
table 6) and the validity increments due to each new predictor test. A
separate table is reported for each MOS. The tables contain the
following information:

o  Multiple correlations (MR), sample validities, and validities
corrected for range restriction

o  Estimates of the cross-validated multiple correlations (CVR)

o  Increment (IN) in the cross-validated multiple correlation over
the ASVAB validity base due to the new predictor

o  Increment expressed as a percentage improvement (%) over the
ASVAB base (IN divided by ASVAB-base CVR).
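
The last two quantities in the list above are simple arithmetic on the
cross-validated multiple correlations. The sketch below takes the CVRs as
given inputs, since the cross-validation formula itself is not stated in
this section; the function name and the example values are hypothetical.

```python
def validity_increment(cvr_base, cvr_with_new):
    """Return (IN, %) for one new predictor.

    cvr_base:     cross-validated R for the ASVAB subtests alone
    cvr_with_new: cross-validated R for ASVAB plus the new predictor
    """
    increment = cvr_with_new - cvr_base          # IN
    percent = 100.0 * increment / cvr_base       # IN divided by ASVAB-base CVR
    return increment, percent

# Hypothetical CVRs: a small positive gain and a negative one.
for base, with_new in [(0.66, 0.67), (0.40, 0.398)]:
    inc, pct = validity_increment(base, with_new)
    print(f"base={base:.3f}  with new={with_new:.3f}  IN={inc:+.3f}  %={pct:+.1f}")
```

A negative IN can arise because the CVR is penalized for each added
predictor, so a test that contributes little may lower the
cross-validated estimate.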

Grade  point  average  was  combined  for  all  four  MOSs  and  reported  in  a 
separate table because all individuals received the same initial
infantry  training.  Findings  are  reported  for  both  enlistment  and 
concurrent  aptitude  information. 
Table 8.  Increments in validity by new predictor tests for infantry
machinegunner performance

Increment in cross-validated multiple correlation by new test was negative
due to adjustment made for shrinkage.


There  were  occasional  instances  in  which  the  increments  in  the  CVR 
due  to  the  new  predictor  tests  were  negative.  This  is  due  to 
adjustments  that  are  made  in  computing  the  CVR  to  account  for  the 
additional  predictor.  For  those  cases  in  which  the  change  in  CVR  was 
negative,  the  additional  predictor  did  not  improve  the  overall 
validity. 

The  analyses  focused  on  the  rifleman  MOS  because  over  1,000  were 
tested  as  part  of  the  JPM  project.  Complete  criterion  and  predictor 
information  was  available  for  approximately  870  riflemen.  Complete  data 
for the other three infantry specialties were collected on fewer than 250
examinees.  Due  to  the  potential  impact  of  sampling  errors  on  computing 
differences  in  validity  coefficients  for  specialties  with  relatively 
small  samples,  more  emphasis  was  placed  on  the  rifleman  findings. 

Enlistment  Versus  Concurrent  Aptitude 

The CVRs were larger for concurrent than for enlistment aptitude
scores (see tables 7 through 11). However, the increments in CVRs were
smaller for concurrent than for enlistment aptitude scores. Given
this combination of a higher validity base but lower increments, the
percentage change for increments in validity was lower for concurrent
than for enlistment aptitude scores. Therefore, the concurrent
administration of the ASVAB does appear to account for some error
sources resulting from time differences between the enlistment aptitude
testing and the administration of the new predictors.

The percentages for validity increments based on concurrent
aptitude scores were typically half as large as the percentage
increments shown against enlistment aptitude scores. Figures 1 and 2
plot the percentage increments in the validity of all rifleman
performance measures. The controlling effect of concurrent aptitude was
to increase the magnitude of the CVRs while reducing the validity gains
due to the new predictor tests. Despite differences in incremental
validities based on enlistment versus concurrent aptitude scores, the
rank ordering of the new predictors yielding the largest validity gains
was not affected.

Best  New  Predictor  for  Each  Criterion 

Table  12  summarizes  the  information  presented  in  tables  7  through 
11  by  highlighting  the  best  single  new  predictor  test  against  each 
criterion  for  all  four  MOSs.  Several  consistent  trends  emerged. 
Figure 1.  Percentage increment in validity for infantry rifleman performance:
enlistment aptitude scores

NOTE:  Increases for training grades include data from other MOSs.

Figure 2.  Percentage increment in validity for infantry rifleman performance:
concurrent aptitude scores


Table 12.  Best new predictor test for each criterion and MOS

                               Criterion
MOS                HOPT       JKT        PRO        GPA
-----------------------------------------------------------
Rifleman           VF         AS         ASAP       VF (a)
Machinegunner      SP         AS         ASAP
Mortarman          AS         AS, SP     ASAP
Assaultman         VF, AS     AS         ASAP

a.  Validity results against GPA were based on examinees from all MOSs.
Findings were consistent for both training locations.


Different new predictor tests produced the largest increments in
validity against HOPT for the four MOSs. These findings were consistent
with the differences in job requirements, which were reflected in
differences in hands-on test content for these specialties. The
hands-on test for the rifleman specialty required each Marine to
negotiate an unknown trail as if on a squad patrol and to engage popup
targets with the M16A2 rifle. The prediction of accurately hitting
these targets and other rifleman tasks was most improved by the video
firing (VF) test. Similarly, for the assaultman MOS, each Marine was
required to fire the Launch Effects Trainer (LET) from the sitting-,
kneeling-, and standing-supported positions. This laser trainer
simulated the actual firing of the Dragon missile. Again, the VF test
was one of the better new predictors in improving the assaultman
validity; the assembling objects (AS) test also was found to enhance the
validity. Job requirements for the machinegunner and mortarman
specialties tended to be more spatially oriented. Machinegunners were
required to establish intersecting fields of fire as well as to prepare
range cards that document direction, elevation, and range of targets.
The space perception (SP) test was found to be the best new predictor in
improving the prediction of machinegunner job performance. The
mortarman hands-on test required the Marine to complete many procedural
requirements in mounting, boresighting, and laying the mortar. The AS
test resulted in the most incremental validity for this specialty.

The JKTs for each MOS contained many common infantry items, although
each test also had some items that were unique. AS was found to be the
best new predictor test in improving the validity against each JKT.
Such a consistent outcome may be due to the dominance of test content
similarity for the core infantry tasks of these specialties.
The ASVAB only moderately predicted PRO marks. The ASAP was
invariably  the  best  new  predictor  for  improving  the  validity  for  these 
supervisor  ratings.  Because  of  the  low  ASVAB  validity  base  in 
predicting  PRO,  most  of  the  percentage  increments  are  large.  Despite 
such  significant  percentage  improvements,  the  absolute  validities 
against  PRO  marks  were  still  relatively  low. 

Validity  Increments  Controlling  for  Time  in  Service 

Time  in  service  and  its  square  were  entered  into  the  regressions 
along  with  the  ASVAB  subtests  as  the  incremental  validity  of  each  new 
predictor  test  was  redetermined.  Detailed  tables  of  the  absolute  and 
incremental  validities  are  reported  in  appendix  B  and  summarized  here. 

The net effect of including time in service in the regression was a
rather substantial increase in the absolute validity for HOPT and PRO
but not for JKT. In other words, experience had a strong effect on the
level of an individual's HOPT and PRO scores, while individuals performed
at comparable levels on the JKT despite any differences in experience.
It followed that controlling for time in service also tended to reduce
the percentage increment of the validity gain due to the new
predictor. However, despite this reduction in percentage gains, the
best set of new predictors for each criterion was the same as previously
determined for enlistment and concurrent aptitude (as shown in table
12).


Summary 

Several  corrections  were  made  to  the  validity  coefficients  to 
account  for  the  impact  of  various  extraneous  sources  of  error.  Such 
corrections  tended  to  significantly  reduce  the  gains  in  validity  due  to 
the  new  predictor  test.  Table  13  summarizes  the  impact  of  these 
corrections  by  reporting  means  and  standard  deviations  of  the  percentage 
increments across all new predictors and MOSs (N equals at least 20 for
each cell of the table: four MOSs and five new predictor tests). Given
the  extreme  magnitude  of  the  results  for  proficiency  marks,  they  are  not 
included  in  this  table. 


Incremental  validities  corrected  for  range  restriction  were 
typically  half  as  large  as  the  sample  incremental  validities,  a  mean 
percentage  increment  of  1.0  percent  versus  2.0  percent.  Increments 
based  on  concurrent  aptitude  were  likewise  less  than  gains  computed  for 
enlistment  aptitude:  a  mean  percentage  increment  of  1.2  percent  versus 
2.8  percent  for  differences  in  observed  validities,  and  a  mean 
percentage  increment  of  0.6  percent  versus  1.3  percent  for  differences 
in  corrected  validities.  Adjustments  for  time  in  service  reduced  even 
further  both  absolute  and  percentage  increments  (these  figures  are  not 
summarized  in  table  13).  The  impact  of  these  error  sources  highlights 
the  potential  for  considerable  overestimation  of  incremental  validities 
if  appropriate  corrections  and  adjustments  are  not  made. 
Table 13.  Means and standard deviations (a) of percentage gains in
incremental validity for all new predictor tests and MOSs

                          Observed                          Corrected
                   Enlistment    Concurrent          Enlistment    Concurrent
------------------------------------------------------------------------------
HOPT               3.0 (2.6)     1.4 (1.1)           1.6 (1.4)     0.9 (0.7)
JKT                3.2 (2.9)     1.1 (1.3)           1.3 (1.2)     0.6 (0.6)
GPA                1.7 (2.1)     [illegible] (1.3)   0.6 (0.7)     0.3 (0.4)
------------------------------------------------------------------------------
All criteria       2.8 (2.7)     1.2 (1.2)           1.3 (1.2)     0.6 (0.6)
All criteria and
aptitude scores           2.0 (2.2)                         1.0 (1.0)

NOTE:  Standard deviations are in parentheses.

a.  For HOPT and JKT, means and standard deviations are computed over
four MOSs and five new predictor tests (N equals 20 for each cell).  For
GPA, the statistics are computed over two bases and five new predictors
(N equals 10 for each cell).
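
As a consistency check on table 13, the column summaries can be
approximated by weighting each criterion mean by its cell size (20 for
HOPT and JKT, 10 for GPA). That the summary row is an N-weighted mean is
an assumption; the memorandum does not say how it was computed. The
observed-concurrent column is left out because its GPA entry is not
legible in this copy, and the variable and function names are
hypothetical.

```python
# Cell sizes from the table 13 footnote.
cell_n = {"HOPT": 20, "JKT": 20, "GPA": 10}

# Mean percentage gains from table 13 (legible columns only).
column_means = {
    "observed, enlistment":  {"HOPT": 3.0, "JKT": 3.2, "GPA": 1.7},
    "corrected, enlistment": {"HOPT": 1.6, "JKT": 1.3, "GPA": 0.6},
    "corrected, concurrent": {"HOPT": 0.9, "JKT": 0.6, "GPA": 0.3},
}

def pooled_mean(means, weights):
    """N-weighted mean of the criterion means in one column."""
    total_n = sum(weights[name] for name in means)
    return sum(means[name] * weights[name] for name in means) / total_n

pooled = {col: pooled_mean(vals, cell_n) for col, vals in column_means.items()}
for col, value in pooled.items():
    print(f"{col}: {value:.2f}")
```

The pooled values come out at approximately 2.82, 1.28, and 0.66, which
agree with the printed 2.8, 1.3, and 0.6 to within the rounding of the
tabled inputs.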


A  final  point  of  interest  is  the  magnitude  of  increments  in 
validity.  These  analyses  have  been  based  on  the  use  of  all  ASVAB 
subtests  in  the  prediction  of  infantry  performance,  while  in  practice 
classification  decisions  are  based  on  aptitude  composites.  As  stated 
earlier,  the  GT  composite  is  used  for  the  specialties  of  the  infantry 
occupational  field.  Table  6  shows  the  GT  validities  for  multiple 
criteria  for  the  rifleman  specialty.  The  ASVAB  validity  bases  are  also 
reported.  The  differences  between  these  validities  computed  for  GT 
versus  the  ASVAB  demonstrate  the  current  inefficiency  of  the  infantry 
classification  system.  By  simply  using  a  more  optimal  classification 
approach  with  all  ASVAB  subtests,  validity  gains  in  the  range  of  2  to  10 
percent  could  be  achieved  against  multiple  criteria.  Similar  validity 
gains  of  6  percent  were  achieved  with  the  recent  change  in  definition  of 
the  Armed  Forces  Qualification  Test  (AFQT)  [12].  Increments  in  validity 
have  been  achieved  in  the  past  by  revising  composite  definitions  and 
still  remain  to  be  captured  by  further  changes  in  the  current 
classification  system. 
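
The 2 to 10 percent range cited above can be illustrated with the
rifleman validities in table 6, by expressing the difference between the
full-ASVAB multiple correlation and the GT validity as a percentage of
the GT validity. This is a sketch only: the function name is
hypothetical, and the figures are based on the two-decimal values printed
in table 6 rather than on the cross-validated estimates.

```python
def percent_gain(base, improved):
    """Percentage improvement of one validity over another."""
    return 100.0 * (improved - base) / base

criteria = ["HOPT", "JKT", "PRO", "GPA A", "GPA B"]

# Rifleman validities from table 6 (corrected for range restriction).
gt = {
    "enlistment": [0.63, 0.78, 0.35, 0.65, 0.40],
    "concurrent": [0.63, 0.80, 0.39, 0.63, 0.41],
}
asvab = {
    "enlistment": [0.67, 0.80, 0.38, 0.66, 0.41],
    "concurrent": [0.69, 0.83, 0.41, 0.67, 0.42],
}

gains = {
    source: {c: percent_gain(g, a)
             for c, g, a in zip(criteria, gt[source], asvab[source])}
    for source in gt
}

for source, by_criterion in gains.items():
    summary = ", ".join(f"{c} {v:.1f}%" for c, v in by_criterion.items())
    print(f"{source}: {summary}")
```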

CONCLUSIONS

Data from the Marine Corps JPM project allowed for a thorough
examination of the measurement and prediction of infantry performance.
These analyses showed that the ASVAB does an excellent job predicting
a variety of infantry performance measures: hands-on performance tests,
written job knowledge tests, and infantry school training grades. ASVAB
moderately predicts an infantryman's proficiency rating. The ability of
any new predictor test to enhance the ASVAB's ability to predict
infantry performance was slight and mixed (except for proficiency marks,
which are questionable as objective measures of job performance).

The  estimation  of  validity  coefficients  is  influenced  by  a  variety 
of  factors:  restriction  of  score  distributions  due  to  the  selection 
process,  shrinkage  in  multiple  correlations  when  applying  optimal 
regression  weights  to  other  samples,  criterion  unreliability,  time  of 
administration  for  the  predictors,  etc.  The  impact  of  these  factors  as 
well  as  sampling  errors  on  validity  coefficients  is  even  further 
magnified  when  the  primary  issue  is  the  difference  between  validity 
coefficients.  Steps  were  taken  to  account  for  several  potential  error
sources  in  the  estimation  of  validity  coefficients.  Such  corrections
and  adjustments  tended  to  significantly  reduce  the  gains  in  validity  due
to  the  new  predictor  tests.
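One of the corrections named above, the adjustment for restriction of score distributions due to selection, can be illustrated with a short sketch. This section does not state which correction formula was applied, so the code below assumes the standard univariate (Thorndike Case II) formula for direct restriction on the predictor. The function name and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def correct_range_restriction(r_sample, sd_unrestricted, sd_restricted):
    """Univariate (Thorndike Case II) correction for direct range restriction.

    r_sample        -- validity observed in the selected (restricted) sample
    sd_unrestricted -- predictor standard deviation in the applicant population
    sd_restricted   -- predictor standard deviation in the selected sample
    """
    if sd_restricted <= 0 or sd_unrestricted <= 0:
        raise ValueError("standard deviations must be positive")
    u = sd_unrestricted / sd_restricted
    return (r_sample * u) / math.sqrt(1.0 - r_sample**2 + (r_sample * u) ** 2)


# Hypothetical values, chosen only for illustration (not taken from the report).
observed_validity = 0.30
corrected_validity = correct_range_restriction(observed_validity, 10.0, 7.5)
print(f"observed {observed_validity:.3f} -> corrected {corrected_validity:.3f}")
```

When the selected sample is less variable than the applicant population, the corrected validity is larger than the observed one. Because the correction is applied both to the ASVAB-only validity and to the validity with a new test added, the difference between the two can shrink or grow after correction.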

Substantial  overestimation  of  incremental  validities  is  possible  if
appropriate  corrections  and  adjustments  are  not  made.  Further
corrections  for  criterion  unreliability  are  necessary  if  policymakers
are  concerned  with  the  absolute  size  of  incremental  validities  (as  would
be  the  case  for  a  cost-benefit  type  of  analysis)  rather  than  with  the
relative  comparison  among  many  new  predictors  to  determine  which  has  the
greatest  potential  for  improving  ASVAB  validity.
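The distinction between absolute and relative comparisons can be made concrete with the classical correction for attenuation, in which a validity is divided by the square root of the criterion reliability. The sketch below is an illustration under that assumption. The test names, validities, and reliability value are hypothetical and do not come from the memorandum.

```python
import math


def correct_for_criterion_unreliability(validity, criterion_reliability):
    """Classical disattenuation: divide by the square root of criterion reliability."""
    if not 0.0 < criterion_reliability <= 1.0:
        raise ValueError("criterion reliability must be in (0, 1]")
    return validity / math.sqrt(criterion_reliability)


# Hypothetical values for illustration only.
base_validity = 0.60                                 # ASVAB alone
validity_with_new_test = {"test_a": 0.62, "test_b": 0.61}
criterion_reliability = 0.80

raw_increments = {
    name: v - base_validity for name, v in validity_with_new_test.items()
}
corrected_increments = {
    name: correct_for_criterion_unreliability(v, criterion_reliability)
    - correct_for_criterion_unreliability(base_validity, criterion_reliability)
    for name, v in validity_with_new_test.items()
}

for name in validity_with_new_test:
    print(name, round(raw_increments[name], 4), round(corrected_increments[name], 4))
```

Because every validity against the same criterion is divided by the same constant, the rank order of the new predictors is unchanged, while the absolute increments grow. This is why the correction matters for a cost-benefit analysis but not for choosing among candidate tests.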

The  collection  of  concurrent  aptitude  information  has  important 
implications  for  the  design  of  future  incremental  validity  research. 

The  written  ASVAB  requires  about  three  to  four  hours  to  administer;  the 
computerized  adaptive  version  can  be  completed  in  about  two  hours.  This 
is  a  significant  time  commitment  which,  if  concurrent  aptitude 
information  is  not  necessary,  could  be  devoted  to  the  administration  of 
additional  new  predictor  tests.  The  results  of  these  analyses  show  that 
concurrent  aptitude  was  necessary  to  control  for  intervening  factors 
between  the  administrations  of  the  ASVAB  and  the  new  predictors. 

Although  there  was  a  high  correlation  between  enlistment  and  concurrent 
aptitude  scores,  approximately  60  percent  of  the  infantrymen  improved 
their  scores  of  record  by  about  two-thirds  of  a  standard  deviation. 

These  gains  in  aptitude  could  be  the  result  of  training,  on-the-job 
experiences,  or  additional  education.  This  requirement  for  concurrent 
aptitude  information  should  be  even  stronger  for  more  technically 
demanding  specialties  where  training  and  job  experience  are  even  more 
intensive  than  for  the  infantry  occupational  field. 

The  Marine  Corps  was  also  able  to  enhance  the  motivation  of  the 
infantrymen  taking  ASVAB  by  changing  their  scores  of  record  if  they 
improved.  This  incentive  was  critical  to  the  collection  of  accurate 
concurrent  aptitudes  and  also  should  be  incorporated  into  any  future 
incremental  validity  research. 

Given  the  variability  of  incremental  validity  estimates  across  MOSs 
and  criteria,  it  is  difficult  to  make  a  strong  recommendation  as  to 
which,  if  any,  of  the  new  predictors  should  be  considered  for  possible 
inclusion  in  the  ASVAB.  Although  similar  percentage  gains  in  other
research  have  been  said  to  carry  potentially  considerable  dollar
value  [13],  no  actual  fiscal  savings  have  yet  been  demonstrated  [12].
Therefore,  the  slight  validity  gains  found  in  these  analyses  have  not
been  shown  to  have  any  tangible  effect  on  the  overall  manpower
selection  and  classification  process.

Even  if  "significant"  increments  in  validity  had  been  noted,
further  investigation  of  the  measurement  properties  of  any  new  tests  is 
still  required.  For  example,  while  the  video  firing  test  tended  to  be 
one  of  the  better  tests  against  hands-on  performance,  the  test  may  be 
susceptible  to  practice  effects  as  demonstrated  in  the  significant  test- 
retest  gains  over  the  period  of  7-10  days.  Performance  on  such  video 
tests  may  also  be  affected  by  previous  experience  with  video  games  or 
computers.  Such  practice  effects  or  experience  could  cancel  any
validity  gains  if  the  test  were  used  for  operational  testing.

Additional  issues  that  would  need  to  be  researched  include  subgroup 
analysis,  coaching  and  test-taking  strategies,  and  logistical  concerns 
for  implementing  the  test  within  an  operational  testing  program. 

Given  the  challenge  to  improve  the  prediction  of  infantry
performance,  it  was  found  that  larger  percentage  gains  can  be  achieved
by  refining  the  current  aptitude  composites  or  by  using  an  optimal
classification  system  based  on  all  ASVAB  subtests  than  can  be  achieved
by  adding  new  predictor  tests  to  the  ASVAB.  Such  gains  may  be  achieved
by  simply  correcting  known  inefficiencies  in  the  current  classification
system.  With  only  minimal  gains  resulting  from  new  predictor  tests  and
an  unknown  benefit  associated  with  such  small  gains,  it  would  be  more
prudent  to  concentrate  on  refining  the  existing  classification  system.

APPENDIX  A


SAMPLE  AND  CORRECTED  CORRELATIONS  OF 
INFANTRY  CRITERIA  AND  PREDICTORS 


Correlations  among  the  Marine  Corps  aptitude  composites  and  all  new 
predictor  tests  are  presented  in  this  appendix.  The  aptitude  composites 
computed  by  the  Marine  Corps  are  General  Technical  (GT),  Mechanical
Maintenance  (MM),  Electronics  Repair  (EL),  Clerical/Administrative  (CL),
and  the  Armed  Forces  Qualification  Test  (AFQT).  The  five  new
predictor  tests  are  space  perception  (SP),  reasoning  test  (RS),
assembling  objects  (AS),  video  firing  (VF),  and  the  Armed  Services
Applicant  Profile  (ASAP).

Separate  tables  are  presented  for  each  MOS  and  each  performance 
measure:  hands-on  performance  test  (HOPT),  job  knowledge  test  (JKT),
and  proficiency  mark  (PRO).  Grade-point  average  (GPA)  is  reported  in 
separate  tables  because  all  MOSs  had  the  same  initial  training.  Sample 
as  well  as  corrected  correlations  are  presented.  Descriptive  statistics 
are  also  presented  for  each  variable. 


Table  A-1.  Correlation  matrix  for  hands-on  performance  test  (sample  values):  infantry  rifleman  (0311)
Table  A-2.  Correlation  matrix  for  job  knowledge  test  (sample  values):  infantry  rifleman  (0311)
Table  A-3.  Correlation  matrix  for  proficiency  marks  (sample  values):  infantry  rifleman  (0311)
Table  A-4.  Correlation  matrix  for  hands-on  performance  test  (sample  values):  machinegunner  (0331)
Table  A-5.  Correlation  matrix  for  job  knowledge  test  (sample  values):  machinegunner  (0331)
Table  A-6.  Correlation  matrix  for  proficiency  marks  (sample  values):  machinegunner  (0331)
Table  A-7.  Correlation  matrix  for  hands-on  performance  test  (sample  values):  mortarman  (0341)
Table  A-8.  Correlation  matrix  for  job  knowledge  test  (sample  values):  mortarman  (0341)
Table  A-10.  Correlation  matrix  for  hands-on  performance  test  (sample  values):  assaultman  (0351)
Table  A-11.  Correlation  matrix  for  job  knowledge  test  (sample  values):  assaultman  (0351)
Table  A-12.  Correlation  matrix  for  proficiency  marks  (sample  values):  assaultman  (0351)
Table  A-13.  Correlation  matrix  for  grade  point  average  from  infantry  training  school  (sample  values):  base
Table  A-14.  Correlation  matrix  for  grade  point  average  from  infantry  training  school  (sample  values)*:  base
Table  A-15.  Corrected  correlation  matrix  for  hands-on  performance  test:  infantry  rifleman  (0311)
Table  A-16.  Corrected  correlation  matrix  for  job  knowledge  test:  infantry  rifleman  (0311)
Table  A-17.  Corrected  correlation  matrix  for  proficiency  marks:  infantry  rifleman  (0311)
Table  A-18.  Corrected  correlation  matrix  for  hands-on  performance  test:  machinegunner  (0331)
Table  A-19.  Corrected  correlation  matrix  for  job  knowledge  test:  machinegunner  (0331)


Table  A-20.  Corrected  correlation  matrix  for  proficiency  marks:  machinegunner  (0331)
Table  A-21.  Corrected  correlation  matrix  for  hands-on  performance  test:  mortarman  (0341)
Table  A-22.  Corrected  correlation  matrix  for  job  knowledge  test:  mortarman  (0341)
Table  A-23.  Corrected  correlation  matrix  for  proficiency  marks:  mortarman  (0341)
Table  A-24.  Corrected  correlation  matrix  for  hands-on  performance  test:  assaultman  (0351)
Table  A-25.  Corrected  correlation  matrix  for  job  knowledge  test:  assaultman  (0351)
Table  A-26.  Corrected  correlation  matrix  for  proficiency  marks:  assaultman  (0351)
Table  A-27.  Corrected  correlation  matrix  for  grade  point  average  from  infantry  training  school:  base
Table  A-28.  Corrected  correlation  matrix  for  grade  point  average  from  infantry  training  school:  base

APPENDIX  B


SAMPLE  AND  CORRECTED  INCREMENTS  IN  VALIDITY  BY  NEW 
PREDICTOR  TESTS,  CONTROLLING  FOR  TIME  IN  SERVICE 


The  tables  of  this  appendix  report  the  ASVAB  validities  and  the 
validity  increments  due  to  each  new  predictor  test  for  the  regressions 
in  which  time  in  service  has  first  been  entered  as  a  predictor.  A 
separate  table  is  reported  for  each  MOS.  The  tables  contain  the 
following  information: 

o  Multiple  correlations  (MR),  sample  validities,  and  validities
corrected  for  range  restriction

o  Estimates  of  the  cross-validated  multiple  correlations  (CVR)

o  Increment  (IN)  in  the  cross-validated  multiple  correlation  over
the  ASVAB  and  time-in-service  validity  base  due  to  the  new
predictor

o  Increment  expressed  as  a  percentage  improvement  (%)  over  the
ASVAB  and  time-in-service  base.

Grade-point  average  is  combined  for  all  four  MOSs  and  reported  in  a 
separate  table  because  all  individuals  received  the  same  initial 
infantry  training.  Findings  are  reported  for  both  enlistment  and 
concurrent  aptitude  information. 

There  were  occasional  instances  in  which  the  increments  in  the  CVR 
due  to  the  new  predictor  test  were  negative.  This  is  due  to  adjustments 
that  are  made  in  computing  the  CVR  to  account  for  the  additional 
predictor.  For  those  cases  in  which  the  change  in  CVR  was  negative,  the 
additional  predictor  did  not  improve  the  overall  validity. 
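The way a shrinkage adjustment can turn a small sample gain into a negative increment can be sketched as follows. The estimator used for the CVR is not given in this section, so the code assumes the familiar Wherry-type adjustment to the squared multiple correlation as a stand-in. The function names, sample size, predictor count, and correlations are all hypothetical.

```python
import math


def adjusted_multiple_r(r_multiple, n_cases, n_predictors):
    """Wherry-type shrinkage adjustment (an assumed stand-in for the CVR)."""
    shrunken_sq = 1.0 - (1.0 - r_multiple**2) * (n_cases - 1) / (
        n_cases - n_predictors - 1
    )
    return math.sqrt(max(shrunken_sq, 0.0))


def validity_increment(base_r, full_r, n_cases, n_base_predictors):
    """Return (IN, %) for adding one new predictor to the base equation.

    Assumes the adjusted base validity is greater than zero.
    """
    base_cvr = adjusted_multiple_r(base_r, n_cases, n_base_predictors)
    full_cvr = adjusted_multiple_r(full_r, n_cases, n_base_predictors + 1)
    increment = full_cvr - base_cvr
    percent = 100.0 * increment / base_cvr
    return increment, percent


# Hypothetical case: 200 examinees, 11 base predictors (for example, ten
# ASVAB subtests plus time in service), and a new test that raises the
# sample multiple correlation only from 0.500 to 0.501.
small_gain = validity_increment(0.500, 0.501, 200, 11)
# Hypothetical case with a larger sample gain, from 0.500 to 0.560.
large_gain = validity_increment(0.500, 0.560, 200, 11)

print("small sample gain: IN = %.4f, %% = %.2f" % small_gain)
print("large sample gain: IN = %.4f, %% = %.2f" % large_gain)
```

In the first case the penalty for estimating one more regression weight outweighs the tiny rise in the sample multiple correlation, so the increment is negative. In the second case the sample gain is large enough to survive the adjustment.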


Table  B-1.  Increments  in  validity  by  new  predictor  tests  for  infantry  rifleman  performance.
Table  B-2.  Increments  in  validity  by  new  predictor  tests  for  infantry  machinegunner  performance,
adjusted  for  time  in  service
Table  B-5.  Increments  in  validity  by  new  predictor  tests  for  infantry  training  grades.